ABSTRACT

Available designs and instruments for evaluation are
not adequate to meet the varied needs of evaluation. Therefore,
evaluation needs to be reshaped in terms of 1) consideration of how
and where currently available theories, designs, and instruments are
proving useful, 2) identification of needs that cannot be met with
currently available constructs and tools, and 3) an attempt to
identify guidelines for efforts to meet unfulfilled needs. In this
context, the history of educational testing and evaluation is briefly
reviewed and some interesting new ideas noted. The rather recent
concepts of formative versus summative evaluation, of fidelity versus
bandwidth of information, and of group evaluation versus individual
evaluation might help to reshape evaluation positively if the
needs for evaluation can be examined within a framework of
educational decision making. (CK)
Symposium: "The World of Evaluation Needs Re-shaping"*



Evaluation Designs and Instruments 
Jack C. Merwin
University of Minnesota 



I was happy to accept the chairman's invitation to participate in
this symposium because I felt the title reflected many of my personal
biases. Within the framework of our frustrations with available designs
and instruments which do not meet many of our varied needs for evaluation,
the term re-shaping implies to me 1) consideration of where and how
currently available theories, designs and instruments are proving useful,
2) identification of needs that cannot be met with currently available
constructs and tools, and 3) an attempt to identify guidelines for efforts
to meet unfulfilled needs.

In my brief comments this morning, I will attempt to put the dimensions
of our current needs in a historical perspective. The most promising aspect
of current frustration is the long overdue recognition that we can no longer
live with the totally unrealistic idea that a small number of designs and
a very limited variety of evaluative instruments can serve all of our
needs for evaluation in education.

I view the following as encouraging signs of movement and trends toward
the needed reshaping of the world of evaluation as it relates to evaluating
individuals:
1. Emphasis on measuring change, rather than status, many problems of
which are brought out in a report of the Wisconsin Symposium,
Problems in Measuring Change, edited by Chester Harris.

2. Explorations of the use of sequential procedures for gathering
information, as opposed to across-the-board administration of
instruments.

3. Experimentation with placement tests, "imbedded" items and
proficiency tests as part of the learning process, such as that
of the Oakleaf Project of Glaser and his associates.

On the last of these points, it is interesting to note something
similar from the past. Monroe's book of 1918, Measuring the Results of
Teaching, carried a focus on mastery of skills related to very specific
objectives.

Our evaluation efforts in recent decades have focused on evaluation
of the individual, and indeed there is further development and reshaping
needed in this area. But there have been other needs for evaluation which
have gone largely unheeded for some time. In his paper "Course Improvement
Through Evaluation," Lee Cronbach describes the situation in this way:

Many types of decisions are to be made, and many varieties of
information are useful. It becomes immediately apparent that
evaluation is a diversified activity and that no one set of
principles will suffice for all situations. But measurement
specialists have so concentrated upon one process--the preparation
of pencil-and-paper achievement tests for assigning scores to
individual pupils--that the principles pertinent to that process
have somehow become enshrined as the principles of evaluation.
Much recent concern has not been with evaluation of individuals but
with evaluation of programs: instruction, curriculum, methodology and so
forth. Looking to the past first, we note that at the turn of the century
there was a similar concern. Rice's classic study of the 1890's was aimed
at a comparison of outcomes of different approaches to teaching the same
subject. The 1916 NSSE Yearbook was entitled Standards and Tests for
Measurement of the Efficiency of Schools and School Systems. That same
year, Arnold produced a book entitled Measurement of Teaching Efficiency.
In 1918, Monroe authored a book entitled Measuring the Results of Teaching,
and the NSSE Yearbook for that year was The Measurement of Educational
Products. It was with the background of design and instrumentation set
forth in such books that the great expansion of achievement testing took
place in the 1920's.

I believe Cronbach hit upon the basic reason for many of our frustrations
today as we look to currently available designs and instruments for program
evaluation. He wrote,

At that time (1920), the content of any course was taken pretty
much as established and beyond criticism save for small shifts of
topical emphasis. At the administrator's discretion, standard
tests covering the curriculum were given to assess the efficiency
of the teacher or the school system. Such administrative testing
fell into disfavor when used injudiciously and heavy-handedly in the
1920's and 1930's. Administrators and accrediting agencies fell back
upon descriptive features of the school program in judging adequacy.
Instead of collecting direct evidence of educational impact, they
judged schools in terms of size of budget, student-staff ratio,
square feet of laboratory space, and the number of advanced credits
accumulated by the teacher.
In this article from the Teachers College Record in 1963, Cronbach's
next sentence is "This tide, it appears, is about to turn." Today we are
looking at the needs for evaluation designs and instruments from a somewhat
different view than our predecessors of the 1920 era. We are concerned
not only with effectiveness of teaching, but also with the effectiveness of
"innovations" in all aspects of education.

Since the 1930's, testing has been almost exclusively designed for
judgments about individuals. Summary figures across scores for individuals
have provided some information regarding program effectiveness. We have
been all too long, however, in coming to the realization that this approach
often is not only inefficient, but simply does not provide some of the
information needed. Thus, whether we attribute it to requirements for
evaluation written into federal legislation, new approaches to teaching,
or numerous curriculum development projects, the pressure has mounted to
produce a healthy concern about the need for reshaping evaluation
methodology and instruments to implement that methodology.

Irritating as it is to face broadened evaluation needs and find that
available tools will simply not do the job, several types of activity
already started indicate movement in promising directions.

One such activity that I would cite is the proposed use of a
decision-making framework as a basis for thinking about evaluation.
Stufflebeam has been working specifically on educational decision making
as a framework, and Cronbach and Gleser earlier had set forth a general
background. Stake's paper, "The Countenance of Educational Evaluation,"
provided a refreshing new view. The attention being given to mastery
testing by Glaser et al. at Pittsburgh and Bloom in Chicago, along with
the work on "universe-defined" tests by Osborne and by Hively, have been
interesting new developments. Cronbach's proposal for an unmatched design
for collecting information from groups should be included in this list,
as should the efforts toward unique designs and instrumentation that have
been under development by the Committee on Assessing the Progress of
Education. And I should not end this listing without mentioning
the AERA Committee on Curriculum Evaluation and the monograph series
started by that Committee.

I also want to mention some concepts of relatively recent vintage
that have not been in the focus of design and instrument development,
but which may well help us in reshaping the world of evaluation around
design and instrumentation. One is the distinction between formative and
summative evaluation set forth by Scriven. A second is the concept of
fidelity versus bandwidth of information suggested by Cronbach and Gleser.
A third is the general idea of group evaluation as opposed to individual
evaluation. And, finally, I would propose that all such concepts might
most readily move us toward a positive reshaping of evaluation if our needs
for evaluation can be examined within the framework of educational decision
making.